Search CORE

468 research outputs found

Establishing a Fusion Model of Attention Mechanism and Generative Adversarial Network to Estimate Students' Attitudes in English Classes

Authors: Song Tianyi, Zhao Tong
Publication venue: 'Mechanical Engineering Faculty in Slavonski Brod'
Publication date: 01/01/2022

With the rapid development of science and technology, artificial intelligence has been widely used in various fields and a new model of AI-aided education has been developed in the new era. In the education industry, AI-aided education can save teachers' energy, improve teaching efficiency and help to refine teaching methods. In order to estimate students' attitudes towards English teachers' lectures, this paper proposed an AI-aided feedback system. In the constructed system, DG-Net was used to expand the data sets of students, and combined with Attention's Alphapose model to collect students' listening poses. The whole model provided feedback of students' listening postures in English speaking and listening classes, assisting teachers to estimate students' attitudes through data analysis and realizing AI-aided education in English classes.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Superiority of Softmax: Unveiling the Performance Edge Over Linear Attention

Authors: Deng Yichuan, Song Zhao, Zhou Tianyi
Publication date: 17/10/2023

Large transformer models have achieved state-of-the-art results in numerous natural language processing tasks. Among the pivotal components of the transformer architecture, the attention mechanism plays a crucial role in capturing token interactions within sequences through the softmax function. Linear attention, by contrast, is a more computationally efficient alternative that approximates the softmax operation with linear complexity. However, it exhibits substantial performance degradation compared to the traditional softmax attention mechanism. In this paper, we bridge the gap in our theoretical understanding of the practical performance gap between softmax and linear attention. By conducting a comprehensive comparative analysis of these two attention mechanisms, we shed light on why softmax attention outperforms linear attention in most scenarios.
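To make the contrast in this abstract concrete, here is a minimal pure-Python sketch of the two mechanisms. It is not taken from the paper: the kernelized form of linear attention and the feature map `phi` (elu(x) + 1) are assumptions chosen as one common formulation, and the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax_attention(Q, K, V):
    """Softmax attention: every query scores every key, so cost grows as n^2."""
    out = []
    for q in Q:
        scores = [dot(q, k) for k in K]
        m = max(scores)  # subtract the max for numerical stability
        e = [math.exp(s - m) for s in scores]
        z = sum(e)
        out.append([sum(e_j * v[c] for e_j, v in zip(e, V)) / z
                    for c in range(len(V[0]))])
    return out

def phi(x):
    """Assumed positive feature map, elu(x) + 1 (not specified by the abstract)."""
    return [t + 1.0 if t > 0 else math.exp(t) for t in x]

def linear_attention(Q, K, V):
    """Kernelized attention: summarize keys and values once, reuse per query."""
    d, dv = len(K[0]), len(V[0])
    S = [[0.0] * dv for _ in range(d)]  # running sum of phi(k_j) v_j^T
    z = [0.0] * d                       # running sum of phi(k_j)
    for k, v in zip(K, V):
        fk = phi(k)
        for a in range(d):
            z[a] += fk[a]
            for c in range(dv):
                S[a][c] += fk[a] * v[c]
    out = []
    for q in Q:
        fq = phi(q)
        norm = dot(fq, z)  # positive because phi is positive
        out.append([sum(fq[a] * S[a][c] for a in range(d)) / norm
                    for c in range(dv)])
    return out
```

In this sketch the softmax version recomputes a score for each query-key pair, while the linear version builds the summaries `S` and `z` in a single pass over the keys, which is where the linear cost in sequence length comes from.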

arXiv.org e-Print Archive

Solving Regularized Exp, Cosh and Sinh Regression Problems

Authors: Li Zhihang, Song Zhao, Zhou Tianyi
Publication date: 28/03/2023

In modern machine learning, attention computation is a fundamental task for training large language models such as Transformer, GPT-4 and ChatGPT. In this work, we study the exponential regression problem, which is inspired by the softmax/exp unit in the attention mechanism of large language models. The standard exponential regression problem is non-convex. We study a regularized version of the exponential regression problem, which is convex, and use an approximate Newton method to solve it in input sparsity time. Formally, one is given a matrix $A \in \mathbb{R}^{n \times d}$, vectors $b \in \mathbb{R}^n$ and $w \in \mathbb{R}^n$, and any of the functions $\exp$, $\cosh$ and $\sinh$, denoted as $f$. The goal is to find the optimal $x$ that minimizes $0.5 \| f(Ax) - b \|_2^2 + 0.5 \| \mathrm{diag}(w) A x \|_2^2$. The straightforward method is to use the naive Newton's method. Let $\mathrm{nnz}(A)$ denote the number of nonzero entries in the matrix $A$. Let $\omega$ denote the exponent of matrix multiplication; currently, $\omega \approx 2.373$. Let $\epsilon$ denote the accuracy error. In this paper, we make use of the input sparsity and propose an algorithm that uses $\log ( \|x_0 - x^*\|_2 / \epsilon)$ iterations and $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega} )$ time per iteration to solve the problem.
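As an illustration of the objective stated in this abstract, the following pure-Python sketch evaluates the regularized loss and its gradient for each choice of f. It does not implement the paper's approximate Newton algorithm or its input-sparsity running time; the helper names (`FUNCS`, `matvec`, `loss`, `gradient`) are hypothetical, and the gradient formula is derived here from the stated loss by the chain rule.

```python
import math

# Each admissible f from the abstract, paired with its derivative.
FUNCS = {
    "exp": (math.exp, math.exp),
    "cosh": (math.cosh, math.sinh),
    "sinh": (math.sinh, math.cosh),
}

def matvec(A, x):
    """Dense matrix-vector product A x, with A given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def loss(A, b, w, x, name="exp"):
    """0.5 * ||f(Ax) - b||_2^2 + 0.5 * ||diag(w) A x||_2^2."""
    f, _ = FUNCS[name]
    Ax = matvec(A, x)
    fit = sum((f(u) - b_i) ** 2 for u, b_i in zip(Ax, b))
    reg = sum((w_i * u) ** 2 for u, w_i in zip(Ax, w))
    return 0.5 * fit + 0.5 * reg

def gradient(A, b, w, x, name="exp"):
    """Gradient A^T r, where r_i = f'(u_i) (f(u_i) - b_i) + w_i^2 u_i and u = A x."""
    f, df = FUNCS[name]
    Ax = matvec(A, x)
    r = [df(u) * (f(u) - b_i) + w_i ** 2 * u
         for u, b_i, w_i in zip(Ax, b, w)]
    return [sum(A[i][j] * r[i] for i in range(len(A)))
            for j in range(len(x))]
```

A gradient of this form could be passed to any first-order or Newton-type solver; comparing it against finite differences of `loss` is a quick sanity check.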

arXiv.org e-Print Archive

A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models

Authors: Sinha Ritwik, Song Zhao, Zhou Tianyi
Publication date: 04/06/2023

Large Language Models (LLMs) have become popular for their remarkable capabilities in human-oriented tasks and traditional natural language processing tasks. Their efficient functioning is attributed to the attention mechanism in the Transformer architecture, which enables them to concentrate on particular aspects of the input. LLMs are increasingly being used in domains such as generating prose, poetry or art, which require the model to be creative (e.g. Adobe Firefly). LLMs possess advanced language generation abilities that enable them to generate distinctive and captivating content. This use of LLMs in generating narratives shows their flexibility and potential for use in domains that extend beyond conventional natural language processing tasks. In other contexts, we may expect the LLM to generate factually correct answers that match reality, e.g., in question-answering systems or online assistants. In such situations, being correct is critical to LLMs being trusted in practice. The Bing Chatbot lets its users select one of three output modes: creative, balanced, and precise. Each mode emphasizes creativity and factual accuracy differently. In this work, we provide a mathematical abstraction to describe creativity and reality based on certain losses. A model trained on these losses balances the trade-off between the creativity and reality of the model.

arXiv.org e-Print Archive

SDHome: Securing Fast Home Networks

Authors: Batula Christopher, Gordon Holden, Zhao Tianyi
Publication venue: Scholar Commons
Publication date: 15/06/2020

Distributed denial of service (DDoS) is a highly discussed network attack in Software Defined Networks. Attacks such as the Mirai Botnet threaten to compromise portions of large networks, including home users. Today, corporations secure their networks using enterprise-level software to protect them from DDoS attacks. But these solutions are meant for large networks and depend on expensive hardware. There are few security solutions for home users, and most are expensive or require a subscription for full protection. In this paper, we propose a new solution in the form of a plug-and-play device that will allow home users to easily take control of their network. We will be using the SDN controller Faucet and the protocol OpenFlow 1.3 to enable software-defined functionalities. In addition to more basic network features such as blocking websites, the device will allow users to receive notifications about possible malicious activities on their network, generate device profiles for all devices on the network, and automatically detect and mitigate flooding attacks using a random forest classifier. We implement our network virtually using Graphical Network Simulator-3.

Scholar Commons - Santa Clara University

Temperature dependent filtration rates and 13C-NMR-enrichment analysis of substrate utilization in Pecten maximus

Author: Zhao Tianyi
Publication date: 01/09/2018

With seawater temperature rising more than 1.5°C (IPCC) above pre-industrial levels, marine organisms are facing increasingly severe climate change. As temperature is an important factor influencing animal physiology, species-specific adaptations have been well observed. Subtidal species are among the animals most strongly influenced by seawater temperature. In previous research, NMR metabolic profiling has proved to be a suitable technique for physiological studies of animals. In this work, the king scallop, Pecten maximus, was studied to test whether (1) consuming labeled phytoplankton is a stable way of 13C-labeling marine filter feeders such as scallops; and (2) the metabolism of P. maximus changes with increasing temperature, as reflected in different filtration rates externally and changing metabolic pathways inside the organs. The scallops were incubated at two temperatures, 15°C and 20°C, and fed with the 13C-labeled diatom Phaeodactylum tricornutum. After three days of filtration rate measurements, tissue samples of the digestive gland and striated adductor muscle were dissected and extracted. Both qualitative and quantitative metabolic profiling was done via 13C NMR analysis. The performance of the experimental animals differed markedly between the two temperature treatments. A higher filtration rate was observed at 20°C, and faster digestion and incorporation of algal lipids was also found in the digestive gland in the 20°C treatment. In the muscle tissue, incorporation of the 13C label was observed in both temperature groups, showing that this labeling technique is applicable to marine filter feeders.

Electronic Publication Information Center

Pseudo-Zero Velocity Re-Detection Double Threshold Zero-Velocity Update Method for Inertial Sensor-Based Pedestrian

Author: Zhao Tianyi
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/07/2021

The zero-velocity update (ZUPT) method is widely used in inertial measurement unit (IMU) based pedestrian navigation systems to mitigate sensor drift error. In a basic pedestrian dead reckoning (PDR) system, especially a foot-tied PDR system, the zero-velocity update method and a Kalman filter are the two core algorithms. In the basic PDR system, ZUPT usually uses a single threshold to judge the gait of pedestrians. A single threshold, however, makes ZUPT unable to accurately judge the gait of pedestrians in different road conditions. In this thesis, we propose a new, redesigned zero-velocity update method that improves the accuracy of the correction results without additional equipment or filter algorithms. The method uses a sliding detection algorithm to re-detect the zero-velocity intervals, aiming to remove pseudo-zero-velocity intervals and pseudo-motion intervals and to improve the performance of the ZUPT method. The method was implemented in a shoe-mounted IMU-based navigation system. In step detection tests at walking speeds of 3-6 km/h, the accuracy of the proposed ZUPT method is on average 23.7% higher than that of conventional methods. In a long-distance walking path tracking test, the mean error of the estimated path for our method is 0.61 m, which is an 81.69% reduction compared to conventional ZUPT methods. The improved ZUPT method presented in this thesis not only enables the tracking technology to better track a pedestrian's step changes during walking, but also provides better calculation conditions for subsequent filter operations.
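The idea of single-threshold detection followed by re-detection can be sketched as below. This is not the thesis's algorithm, whose thresholds and sliding window are not given in the abstract: the use of gyroscope magnitude as the test statistic, the minimum-run-length rule, and the names `detect_stationary` and `remove_short_runs` are all assumptions made for illustration.

```python
def detect_stationary(gyro_norm, threshold):
    """Single-threshold detector: flag a sample as zero-velocity if below threshold."""
    return [g < threshold for g in gyro_norm]

def remove_short_runs(flags, min_len):
    """Re-detection pass: relabel runs shorter than min_len to match the
    preceding run, treating them as pseudo-zero-velocity or pseudo-motion."""
    out = list(flags)
    n = len(out)
    i = 0
    while i < n:
        j = i
        while j < n and out[j] == out[i]:
            j += 1
        # out[i:j] is one run of identical flags; the first run is kept as-is
        if j - i < min_len and i > 0:
            for t in range(i, j):
                out[t] = out[i - 1]
        i = j
    return out

# Hypothetical gyroscope magnitudes (rad/s) for twelve consecutive samples.
gyro = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 2.0, 2.0, 2.0, 0.2, 2.0, 2.0]
raw = detect_stationary(gyro, threshold=0.5)
cleaned = remove_short_runs(raw, min_len=2)
```

Here the isolated spike at index 3 and the isolated dip at index 9 are each relabeled to match their neighbors, leaving one stationary interval followed by one motion interval.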

Scholarship at UWindsor